Description



DNA, RNA AND A PROTEIN USEFUL FOR DETECTION OF A MYCOBACTERIAL INFECTION 

Technical field 

The invention is in the field of clinical medicine, molecular biology and genetic 
engineering. More particularly, it relates to molecular methods of 
tuberculosis diagnosis using newly identified DNA sequences which can be used 
as probes for DNA hybridization and/or for DNA amplification, leading to the 
identification of pathogenic mycobacteria causing disease in humans and 
animals. 



Background 



Tuberculosis, an infectious disease mainly caused by respiratory infection with 
Mycobacterium tuberculosis, represents an important subject of multidisciplinary 
investigation owing to the urgent need for rapid and reliable diagnostic tests and 
effective vaccines for disease control. 

An estimated 8 million persons develop tuberculosis each year and this 
number will keep rising for the foreseeable future. Immunocompromised 
people in particular, e.g. Human Immunodeficiency Virus-infected individuals 
(Selwyn et al., 1989; Barnes et al., 1991), and the populations of countries with 
insufficient public health systems (Grzybowski, 1991; Kochi, 1991) are the most 
endangered groups of this "global disease" (WHO, 1992). The emergence of multiple 
drug resistant strains is posing a major threat to human health not only in 
developing countries, but also in developed countries. A rapid and specific 
diagnosis of tuberculosis is still a problem. 

One approach to address this problem is to use the specific humoral or cellular 
response of the host to infer the presence of disease. Mycobacteria are rich in 
antigens that stimulate the production of antibodies and serology is simple and 
readily applicable as a rapid diagnostic test (Wilkins, 1994). Unfortunately the 
usefulness of serological tests is often limited by their lack of specificity and by 
their inability to distinguish between active disease, prior sensitization by 
contact with M. tuberculosis and cross-sensitization to other mycobacteria. 
Another means of achieving the correct diagnosis is to develop increasingly 
sensitive methods to detect the causative bacilli or their products. Such techniques 
include amplification of a defined region of bacterial DNA via polymerase 
chain reaction (PCR) (Shankar et al., 1991), immunoassays for detecting antigen, 
gas liquid chromatography and mass spectrometry for detecting specific 
mycobacterial lipids. Of these, PCR is being evaluated most intensely and 
appears to hold greatest promise. 

Attempts have been made to develop methods for the detection of chromosomal 
DNA of the M. tuberculosis complex in patients' sputum (Glennon, 1994). While 
the possibility of developing a DNA probe to distinguish between the M. 
tuberculosis complex and other mycobacterial strains has been reported, strain 
differentiation within the individual members of the complex is still a problem. 

In this study we report the isolation of novel genomic clones containing as yet 
unreported genes and DNA, and the identification of novel M. tuberculosis 
chromosomal DNA regions specific for species of the M. tuberculosis complex. 
In addition, amplification of (i) a 377-bp fragment specific for the M. 
tuberculosis complex and (ii) a 380-bp fragment showing sequence 
similarities with the genomes of Mycobacterium asiaticum, Mycobacterium 
gastri, Mycobacterium gordonae and Mycobacterium kansasii is described. The 
utility of the 377-bp and the 380-bp fragments for the differentiation of species and 
strains of mycobacteria is reported. In addition to other ORFs identified in this 
study, a novel ca. 15kDa recombinant protein showing high homology to a 
family of transposases was overproduced in Escherichia coli as a thioredoxin 
fusion and purified. The ca. 15kDa and ca. 31kDa proteins described in this study 
are different from the 35kDa ORF belonging to an insertion element identified by 
Mariani et al. (1993). 

Disclosure of invention 

The present invention is based on novel DNA sequences cloned from the 
genome of Mycobacterium tuberculosis, which can be used for strain 
differentiation and for the diagnosis of tuberculosis. 

Accordingly, the DNA sequences of the cloned fragments are an aspect of the 
invention. 

The cloned DNA fragments are found to code for at least 7 proteins of about 
9kDa, 15kDa, 17kDa, 31kDa, 55kDa, 74kDa and 77kDa, the sequences of which 
are another aspect of the invention. 

The use of the DNA sequence for detecting specific fragments by hybridization 
or by DNA amplification is another aspect of the invention. 

The use of the cloned DNA or of the proteins coded by the cloned DNA for the 
purpose of serology, skin testing, vaccine development or drug design is another 
aspect of the invention. 

The object underlying the invention is achieved by the following 
three main embodiments with their preferred embodiments. 

According to a first embodiment the invention concerns a 
DNA 

(a) having sequence (I) according to figure 9, wherein 
optionally one or more codons can be replaced by codons coding 
for the same amino acid(s), 

(b) having a sequence complementary to that of (a), 

(c) being single-stranded, wherein its strand is hybridizable 
with that of the DNA according to (a) or (b), 

(d) being double-stranded, the sequences of its single strands 
being defined as in (a) and (b), respectively, 

(e) being double-stranded, its single strands being hybridizable 
with those of the DNA according to (d), or 

(f) being a subsequence of the sequences according to (a) to 
(e). 

Further the invention concerns a DNA according to (c) or (e), 
its single strands being hybridizable with those of the DNA 
according to (a), (b), (d) and (f), respectively, at a 
temperature of at least 25 °C and at a concentration of NaCl of 
1 M. 
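Variants (a) and (b) above can be illustrated with a short sketch. It assumes a toy 12-nucleotide sequence that is NOT taken from figure 9, and it uses only a small excerpt of the standard genetic code; the names CODON_TABLE, reverse_complement and translate are hypothetical helpers introduced for illustration.

```python
# Excerpt of the standard genetic code (only the codons used below).
CODON_TABLE = {
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",  # alanine
    "GGT": "G", "GGC": "G", "GGA": "G", "GGG": "G",  # glycine
    "CCT": "P", "CCC": "P", "CCA": "P", "CCG": "P",  # proline
    "ATG": "M",                                      # methionine
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand written 5'->3' (variant (b))."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons of a DNA string using CODON_TABLE."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Hypothetical toy sequence, not part of the claimed sequence (I).
original = "ATGGCCGGTCCG"
# Every codon after ATG replaced by a synonymous codon (variant (a)).
synonymous = "ATGGCGGGCCCA"

print(translate(original), translate(synonymous))
print(reverse_complement(original))
```

Both strings translate to the same peptide although they differ at the nucleotide level, which is the situation covered by the optional codon replacement in (a).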

Further the invention concerns a RNA being a transcript of a DNA 
according to the invention. 

Further the invention concerns a protein being encoded by a DNA 
according to the invention. 

Further the invention concerns a protein having the amino acid 
sequence (II) according to figure 13. 

The protein according to the invention can be an about 74 kDa 
protein. 

Further the invention concerns a protein having the amino acid 
sequence (III) according to figure 14. 

The protein according to the invention can be an about 77 kDa 
protein. 

Further the invention concerns a protein having the amino acid 
sequence (IV) according to figure 15. 

The protein according to the invention can be an about 9 kDa 
protein. 

Further the invention concerns a protein having the amino acid 
sequence (V) according to figure 16. 

The protein according to the invention can be an about 55 kDa 
protein. 

The protein according to the invention can be a recombinant 
protein, especially a protein produced by means of a bacterial 
strain, a yeast strain, a fungal strain or a cell line of a 
higher eucaryote. 

The protein according to the invention can be encoded by a DNA 
sequence according to the first embodiment of the invention and 
can be recovered by a method comprising the following steps: 
(i) subjecting proteins encoded by said DNA sequence to a 
usual test for diagnosis of tuberculosis, 
(ii) selecting a protein showing an inhibitory effect and 
(iii) isolating and recovering said protein. 

Further the invention concerns a DNA, RNA or protein according 
to the invention which can be used for 

(i) diagnosis of tuberculosis in humans and animals and/or 

(ii) diagnosis of other mycobacterial infections in humans or 
animals, 

each especially by means of samples taken from humans or 
animals. 

Further the invention concerns a use of a DNA according to the 
invention for the identification of mycobacteria in media 
samples. 

The foregoing use can comprise the steps of 

(i) isolating the mycobacterium, 

(ii) preparing crude or purified genomic DNA, 

(iii) hybridizing it to a DNA according to the first embodiment 
of the invention and 

(iv) detecting the fragment pattern using conventional methods 
such as a radioactivity assay, chemiluminescence or 
fluorescence. 

Further the invention concerns a use wherein clinical samples 
are used as samples. 

Further the invention concerns a use of a DNA according to the 
invention or of a protein according to the invention for 

(i) epidemiological purposes and/or 

(ii) vaccination follow-up 

for humans or animals suffering from mycobacterial infections, 
especially tuberculosis. 

Further the invention concerns a use of a DNA according to the 
invention or of a protein according to the invention for the 
development of drugs useful for combating mycobacterial 
infections of humans or animals, especially tuberculosis, 
especially for testing and recovering of substances inhibiting 
mycobacterial infections in humans and animals, especially 
tuberculosis. 

According to a second embodiment the invention concerns a DNA 

(a) having sequence (VI) according to figure 2, wherein 
optionally one or more codons can be replaced by codons coding 
for the same amino acid(s), 

(b) having a sequence complementary to that of (a), 

(c) being single-stranded, wherein its strand is hybridizable 
with that of the DNA according to (a) or (b), 

(d) being double-stranded, the sequences of its single strands 
being defined as in (a) and (b), respectively, 

(e) being double-stranded, its single strands being hybridizable 
with those of the DNA according to (d), or 

(f) being a subsequence of the sequences according to (a) to 
(e). 

Further the invention concerns a DNA according to (c) or (e), 
its single strands being hybridizable with those of the DNA 
according to (a), (b), (d) and (f), respectively, at a 
temperature of at least 25 °C and at a concentration of NaCl of 
1 M. 

Further the invention concerns a RNA being a transcript of a DNA 
according to the invention. 

Further the invention concerns a protein being encoded by a DNA 
according to the invention. 

Further the invention concerns a protein having the amino acid 
sequence (VII) according to figure 5. 

The protein according to the invention can be an about 15 kDa 
protein. 

Further the invention concerns a protein having the amino acid 
sequence (VIII) according to figure 6. 

The protein according to the invention can be an about 31 kDa 
protein. 

The protein according to the invention can be a recombinant 
protein, especially a protein produced by means of a bacterial 
strain, a yeast strain, a fungal strain or a cell line of a 
higher eucaryote. 

The protein according to the invention can be encoded by a DNA 
sequence according to the second embodiment of the invention and 
can be recovered by a method comprising the following steps: 

(i) subjecting proteins encoded by said DNA sequence to a 
usual test for diagnosis of tuberculosis, 

(ii) selecting a protein showing an inhibitory effect and 

(iii) isolating and recovering said protein. 

Further the invention concerns a DNA, RNA or protein according 
to the invention which can be used for 

(i) diagnosis of tuberculosis in humans and animals and/or 

(ii) diagnosis of other mycobacterial infections in humans or 
animals, 

each especially by means of samples taken from humans or 
animals. 

Further the invention concerns a use of a DNA according to the 
invention for the identification of mycobacteria in media 
samples. 

The foregoing use can comprise the steps of 

(i) isolating the mycobacterium, 

(ii) preparing crude or purified genomic DNA, 

(iii) hybridizing it to a DNA according to the second embodiment 
of the invention and 

(iv) detecting the fragment pattern using conventional methods 
such as a radioactivity assay, chemiluminescence or 
fluorescence. 

Further the invention concerns a use wherein clinical samples 
are used as samples. 

Further the invention concerns a use of a DNA according to the 
invention or of a protein according to the invention for 

(i) epidemiological purposes and/or 

(ii) vaccination follow-up 

for humans or animals suffering from mycobacterial infections, 
especially tuberculosis. 

Further the invention concerns a use of a DNA according to the 
invention or of a protein according to the invention for the 
development of drugs useful for combating mycobacterial 
infections of humans or animals, especially tuberculosis, 
especially for testing and recovering of substances inhibiting 
mycobacterial infections in humans and animals, especially 
tuberculosis. 

According to a third embodiment the invention concerns a DNA 

(a) having sequence (IX) according to figure 3, wherein 
optionally one or more codons can be replaced by codons coding 
for the same amino acid(s), 

(b) having a sequence complementary to that of (a), 

(c) being single-stranded, wherein its strand is hybridizable 
with that of the DNA according to (a) or (b), 

(d) being double-stranded, the sequences of its single strands 
being defined as in (a) and (b), respectively, 

(e) being double-stranded, its single strands being hybridizable 
with those of the DNA according to (d), or 

(f) being a subsequence of the sequences according to (a) to 
(e). 

Further the invention concerns a DNA according to (c) or (e), 
its single strands being hybridizable with those of the DNA 
according to (a), (b), (d) and (f), respectively, at a 
temperature of at least 25 °C and at a concentration of NaCl of 
1 M. 

Further the invention concerns a RNA being a transcript of a DNA 
according to the invention. 

Further the invention concerns a protein being encoded by a DNA 
according to the invention. 

Further the invention concerns a protein having the amino acid 
sequence (X) according to figure 7. 

The protein according to the invention can be an about 17 kDa 
protein. 

The protein according to the invention can be a recombinant 
protein, especially a protein produced by means of a bacterial 
strain, a yeast strain, a fungal strain or a cell line of a 
higher eucaryote. 

The protein according to the invention can be encoded by a DNA 
sequence according to the third embodiment of the invention and 
can be recovered by a method comprising the following steps: 

(i) subjecting proteins encoded by said DNA sequence to a 
usual test for diagnosis of tuberculosis, 

(ii) selecting a protein showing an inhibitory effect and 

(iii) isolating and recovering said protein. 

Further the invention concerns a DNA, RNA or protein according 
to the invention which can be used for 

(i) diagnosis of tuberculosis in humans and animals and/or 

(ii) diagnosis of other mycobacterial infections in humans or 
animals, 

each especially by means of samples taken from humans or 
animals. 

Further the invention concerns a use of a DNA according to the 
invention for the identification of mycobacteria in media 
samples. 

The foregoing use can comprise the steps of 

(i) isolating the mycobacterium, 

(ii) preparing crude or purified genomic DNA, 

(iii) hybridizing it to a DNA according to the third embodiment 
of the invention and 

(iv) detecting the fragment pattern using conventional methods 
such as a radioactivity assay, chemiluminescence or 
fluorescence. 

Further the invention concerns a use wherein clinical samples 
are used as samples. 

Further the invention concerns a use of a DNA according to the 
invention or of a protein according to the invention for 

(i) epidemiological purposes and/or 

(ii) vaccination follow-up 

for humans or animals suffering from mycobacterial infections, 
especially tuberculosis. 

Further the invention concerns a use of a DNA according to the 
invention or of a protein according to the invention for the 
development of drugs useful for combating mycobacterial 
infections of humans or animals, especially tuberculosis, 
especially for testing and recovering of substances inhibiting 
mycobacterial infections in humans and animals, especially 
tuberculosis. 

The invention is explained in detail by the following figures 
and experimental data. 

Fig. 1 shows a restriction endonuclease map of the 7.2 kb M. 
tuberculosis chromosomal region; 

Fig. 2 shows a 2253 bp M. tuberculosis chromosomal region 
including BamHI, EcoRI and KpnI restriction sites and 
oligonucleotides for screening the lambda gt11 M. tuberculosis 
library (Primer 1 and Primer 2 underlined) and for amplification 
of the 377 bp region (377 bp region in bold, Primer 3 and Primer 
4 underlined); amino acid sequences of the about 15 kDa and the 
about 31 kDa proteins are shown above the DNA sequences and are 
marked with arrows (small arrow about 15 kDa ORF 1, strong arrow 
about 31 kDa ORF 2); 

Fig. 3 shows a DNA sequence of the 440 bp M. tuberculosis 
chromosomal region including the 380 bp region (in bold) used in 
PCR experiments and the amino acid sequence of the ORF 3 shown 
below the complementary DNA strand (< ORF 3); 

Fig. 4 is an overview of the isolated lambda gt11 clone C9-2; 
7.2 kb insert fragment, sequenced chromosomal regions and ORF 1, 
ORF 2 and ORF 3 marked with arrows; 

Fig. 5 shows the amino acid sequence of the about 15 kDa protein 
(ORF 1) ; 

Fig. 6 shows the amino acid sequence of the about 31 kDa protein 
(ORF 2); 

Fig. 7 shows the amino acid sequence of the about 17 kDa 
protein; 

Fig. 8 A shows SDS-PAGE of the insoluble pellet fraction (lane 
1) and the purified about 15 kDa recombinant antigen (lane 2); 
lane 3 shows protein molecular weight standards (2,850 to 
43,000 molecular weight range); 

Fig. 8 B shows SDS-PAGE of the purified about 15 kDa thioredoxin 
fusion protein (lane 1) and the two protein bands obtained after 
enterokinase cleavage (lane 2); 

Fig. 9 shows a DNA sequence of M. tuberculosis; 

Fig. 10 is a schematic drawing of the clone Mtub-Clara-Klon; the 
open reading frames of about 9 kDa (bp 3536 to bp 3829), 55 kDa 
(bp 2111 to bp 3829), 74 kDa (bp 1538 to bp 3829) and 77 kDa (bp 
2698 to bp 2 on the complementary strand) proteins are shown by 
arrows and the corresponding coding regions are numbered; 

Fig. 11 A shows a Southern hybridization with genomic DNA from 
different mycobacteria digested with PvuII (1: M. tuberculosis 
H37Rv; 2: M. avium; 3: M. kansasii; 4: M. necroti; 5: M. 
fortuitum; 6: M. phlei; 7: M. smegmatis; 8: M. vaccae); 

Fig. 11 B shows a finger-print obtained using the DNA (BamHI 
digest) of (1) M. tuberculosis H37Rv, (2) M. tuberculosis 
H37Ra, (3) M. bovis BCG, and (4) M. tuberculosis H37Rv digested 
with SalI; 

Fig. 12 shows a finger-print with DNA from different M. 
tuberculosis clinical isolates (numbered 1 to 12) digested with 
PvuII restriction enzyme; the 4 kb SalI fragment (Mtub-Klar- 
Klon) was used as probe; 

Fig. 13 shows an amino acid sequence of the protein of about 74 
kDa (molecular weight 74999, length 764); 

Fig. 14 shows a glycine rich protein of about 77 kDa (molecular 
weight 77056, length 899); 

Fig. 15 shows the amino acid sequence of the about 9 kDa proline 
rich protein (molecular weight 9356, length 98); and 

Fig. 16 shows the proline rich protein of about 55 kDa 
(molecular weight 55982, length 573). 
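The coordinates given for Fig. 10 can be cross-checked against the lengths given for Figs. 13 to 16. The following minimal sketch assumes that "length" means the number of codons spanned by each open reading frame; the dictionary layout and the helper names are hypothetical and only restate numbers from the figure legends.

```python
# Coordinates (bp) from the Fig. 10 legend; weights and lengths from Figs. 13-16.
ORFS = {
    "about 9 kDa":  {"start": 3536, "end": 3829, "mw": 9356,  "length": 98},
    "about 55 kDa": {"start": 2111, "end": 3829, "mw": 55982, "length": 573},
    "about 74 kDa": {"start": 1538, "end": 3829, "mw": 74999, "length": 764},
    # Located on the complementary strand, so start is larger than end.
    "about 77 kDa": {"start": 2698, "end": 2,    "mw": 77056, "length": 899},
}


def codon_count(start: int, end: int) -> int:
    """Number of complete codons in an inclusive coordinate range."""
    span = abs(end - start) + 1
    if span % 3 != 0:
        raise ValueError(f"span of {span} bp is not a multiple of 3")
    return span // 3


def mean_residue_mass(mw: float, length: int) -> float:
    """Average mass per residue in Da."""
    return mw / length


for name, orf in ORFS.items():
    codons = codon_count(orf["start"], orf["end"])
    mass = mean_residue_mass(orf["mw"], orf["length"])
    print(f"{name}: {codons} codons, stated length {orf['length']}, "
          f"{mass:.1f} Da per residue")
```

Under this assumption the coordinate spans and the stated lengths agree for all four proteins, and the glycine rich protein of about 77 kDa has the lowest average mass per residue, as would be expected for a sequence rich in the smallest amino acid.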

Modes for carrying out the invention 

We were interested in identifying and cloning novel DNA sequences from the 
genome of Mycobacterium tuberculosis for use in rapid and specific diagnosis 
of tuberculosis. Our strategy was to search for new repeated elements and insertion 
elements which are present only in M. tuberculosis or in the strains of the M. 
tuberculosis complex. 



Examples 

The following examples further describe the isolation and sequencing of M. 
tuberculosis DNA containing a putative IS element (insertion element) and repeat 
sequences, e.g., PGRS elements (polymorphic GC-rich sequences), and the use 
of the as yet unreported DNA sequences for strain identification and diagnosis 
of tuberculosis. 



Escherichia coli strains, phages and plasmids: The Escherichia coli K12 strain 
Y1090r⁻ (Huynh et al., 1985) was used to propagate the λgt11 library and the E. 
coli K12 strain GI724 (Invitrogen, Leek, The Netherlands) was the host for the 
production of the ca. 15kDa protein fused to thioredoxin. 
The recombinant DNA library of M. tuberculosis genomic DNA in the λgt11 
expression vector was constructed by Young et al. (1985). 
The plasmid vector pTrxFus (Invitrogen, Leek, The Netherlands) was used to 
make an in-frame fusion with thioredoxin as an amino-terminal fusion partner. 

Mycobacterial strains and preparation of cell extracts: The mycobacterial strains 
used in this study are shown in Table 1 (Results and Discussion). All organisms 
were grown on Loewenstein medium. For preparing cell extracts a loop of bacteria was 
suspended in 0.5 ml of 10 mM Tris/base, 1 mM EDTA (pH 7.4) followed by addition of 0.5 ml 
glass beads (150-212 microns, Sigma, Deisenhofen, Germany). The suspension was incubated 
at 80°C for 10 min followed by a 1 min treatment in a Mini-Bead Beater (Biospec Products). 

DNA sequence analysis: Similarity comparisons were done using the BLAST program 
(Pearson and Lipman, 1988; NCBI computing facility). 

All DNA manipulations were done according to standard procedures (see Maniatis et al., 
1982). 

DNA sequencing: DNA sequencing analysis was performed by the dideoxynucleotide 
chain termination method using a PCR sequencing kit (ABI PRISM™ Dye Terminator Cycle 
Sequencing Ready Reaction Kit, Perkin Elmer, Warrington, Great Britain) on a 373A DNA 
Sequencer (Applied Biosystems, Warrington, Great Britain). DNA sequences were determined 
for both strands by primer walking. 

1. Clone containing putative IS-Element 
1.1 Isolation of the clone C9-2 containing a putative IS element: 
In our attempt to isolate new mycobacterial insertion elements, a λgt11 M. tuberculosis library 
was screened with oligodeoxyribonucleotide primers based on conserved regions of different 
insertion elements. The library was screened as described by Young and Davis (1985). Briefly, 
phage-infected cells of the strain E. coli Y1090r⁻ were plated in top agar on Luria-Bertani plates 
(7.0 x 10^6 PFU per 85 mm plate) and incubated for 6-8 h at 42°C. Nylon membranes (Biodyne 
B Transfer Membrane, 0.45 µm, Pall, Portsmouth, England) were overlaid on plates. The filters 
were treated with 0.5 N NaOH, 1.5 M NaCl and the DNA was fixed via UV-crosslinking. 
Screening was performed using 3'-end labeled oligonucleotides of the sequence 5'- 
TGACGCGAGTGGGTGTGATTTCG-3' and 5'-GTGGTCGAGCCGTTGATGCCG-3' (Fig. 2, 
PRIMER 1 and PRIMER 2). Digoxigenin-labeling of the oligodeoxyribonucleotide primers 
was carried out using a DIG Oligonucleotide 3'-End Labeling Kit (Boehringer Mannheim, 
Germany). Hybridization was done at 45°C in hybridization buffer (Boehringer Mannheim, 
Germany) overnight. Then the membranes were washed under stringent conditions for 5 min 
twice in 2 x SSC, 0.1% SDS and for 15 min twice at 37°C in 0.1 x SSC, 0.1% SDS. 
Chemiluminescent detection was carried out with the help of a DIG Luminescent Detection Kit 
(Boehringer Mannheim, Germany). Plaques were purified by three rounds of plating to obtain 
single plaques. Phage DNA was isolated using a Nucleobond AX L50 Kit (Macherey-Nagel, 
Düren, Germany) and restriction mapping of the selected clone was performed by standard 
procedures (Maniatis et al., 1982). 
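The two screening oligonucleotides can be characterised with a short sketch. It uses the Wallace rule, Tm = 2(A+T) + 4(G+C), which is an assumption introduced here and not a method named in the text; the rule is a rough estimate intended for short oligonucleotides and it ignores the composition of the hybridization buffer. The function names are hypothetical.

```python
PRIMER_1 = "TGACGCGAGTGGGTGTGATTTCG"  # screening probe, Fig. 2
PRIMER_2 = "GTGGTCGAGCCGTTGATGCCG"    # screening probe, Fig. 2


def base_counts(oligo: str) -> dict:
    """Count each of the four bases in an oligonucleotide."""
    oligo = oligo.upper()
    return {base: oligo.count(base) for base in "ACGT"}


def gc_fraction(oligo: str) -> float:
    """Fraction of G and C bases."""
    counts = base_counts(oligo)
    return (counts["G"] + counts["C"]) / len(oligo)


def wallace_tm(oligo: str) -> int:
    """Rough melting temperature in degrees C (Wallace rule, assumption)."""
    counts = base_counts(oligo)
    return 2 * (counts["A"] + counts["T"]) + 4 * (counts["G"] + counts["C"])


for name, oligo in (("PRIMER 1", PRIMER_1), ("PRIMER 2", PRIMER_2)):
    print(name, len(oligo), "nt,",
          f"GC {gc_fraction(oligo):.0%},",
          f"Wallace Tm about {wallace_tm(oligo)} C")
```

Both estimates lie well above the 45°C used for the overnight hybridization, which is consistent with probes that remain bound under the washing conditions described, although the rule is too crude to support a stronger statement.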

Several positive clones were obtained. Detailed analysis of one of the clones (C9-2) is 
presented here. The recombinant phage was mapped with the restriction endonucleases BamHI, 
EcoRI and KpnI (Fig. 1). EcoRI digestion revealed a 7.2 kb DNA insert fragment. 

1.2 DNA sequencing of the cloned fragment: 

Two M. tuberculosis chromosomal regions of 2253 bp and 440 bp of this fragment were 
sequenced (Fig. 2 and Fig. 3). DNA sequencing of the 2253-bp region revealed the presence of a 
putative insertion element between bp 401 and bp 1378 containing inverted repeats flanked by 
duplications of 4 base pairs. The cloned fragment reported here is novel and is located at a 
different position than the 2.1 kb PstI/EcoRI fragment reported by Mariani et al. (1993), 
because the DNA sequences of the adjoining regions on the left and the right ends of the putative 
IS element in our clone C9-2 are completely different from those reported by Mariani 
et al. (1993). 

Fig.4 gives an overview of the 7.2 kb insert fragment and the sequenced chromosomal regions. 

1.3 Novel proteins coded by the cloned DNA: 

During the molecular characterization of the clone, novel ORFs were identified. The complete 
ORF of the ca. 15kDa protein is located on the 2253-bp fragment and is coded by a 408-bp fragment, 
corresponding to a coding capacity of 136 amino acids. The ca. 15kDa protein (Fig. 5) is a novel 
product showing limited homology to the N-terminus of a 34kDa ORF reported by Mariani et 
al. (1993). We also identified an ORF of about 31kDa (Fig. 2 and Fig. 6) coded by the cloned 
DNA (bp 515 to bp 1378). The N-terminus of this 31kDa ORF did not show any homology to 
any known sequence in the database. The C-terminus of the ca. 31kDa protein showed 
homology to a 34kDa ORF (Mariani et al., 1993). We have not used the DNA sequence 
showing homology to the sequence reported by Mariani et al. (1993) as far as the claims of this 
patent application are concerned. An ORF (ORF 3, Fig. 3 and Fig. 7) on the complementary 
strand at the 3'-end of the insert fragment of the recombinant λ clone C9-2 was identified, 
which had not been reported earlier. This sequence showed homology to a family of 
transcription regulators in microorganisms. In addition, some homology was observed with a 
putative two-component system mtrA-mtrB isolated from M. tuberculosis H37Rv (Via et al., 
1996) and to PhoP of Bacillus subtilis (Lee and Hulett, 1992). Based on these data, the DNA 
sequence (440-bp fragment, Fig. 3) and the derived polypeptide might play a role in regulation 
of virulence in mycobacteria. 
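The sizes quoted in sections 1.2 and 1.3 can be related to each other with simple arithmetic. The sketch below assumes an average residue mass of 110 Da, a common rule of thumb that is not taken from the text, so the resulting masses are only approximate; the helper names are hypothetical.

```python
AVG_RESIDUE_MASS_DA = 110.0  # rule-of-thumb assumption, not from the source


def span_bp(first: int, last: int) -> int:
    """Length in bp of an inclusive coordinate range."""
    return last - first + 1


def codons(length_bp: int) -> int:
    """Number of complete codons in a stretch of DNA."""
    return length_bp // 3


def approx_mass_kda(residues: int) -> float:
    """Approximate protein mass in kDa from the residue count."""
    return residues * AVG_RESIDUE_MASS_DA / 1000.0


is_element_bp = span_bp(401, 1378)   # putative insertion element
orf1_residues = codons(408)          # ca. 15kDa protein, 408-bp ORF
orf2_bp = span_bp(515, 1378)         # ca. 31kDa ORF
orf2_residues = codons(orf2_bp)

print("putative IS element:", is_element_bp, "bp")
print("ORF 1:", orf1_residues, "codons,",
      f"about {approx_mass_kda(orf1_residues):.1f} kDa")
print("ORF 2:", orf2_bp, "bp,", orf2_residues, "codons,",
      f"about {approx_mass_kda(orf2_residues):.1f} kDa")
```

With this assumption the 408-bp ORF corresponds to the stated 136 amino acids and to roughly 15 kDa, and the ORF from bp 515 to bp 1378 corresponds to roughly 31 to 32 kDa, in line with the sizes named in the text.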

1.4 Cloning, expression and purification of the ca. 15kDa protein fused to 
thioredoxin 

The λgt11 clone C9-2 (Fig. 4) was used as template to amplify a PCR fragment of 951 
bp (Fig. 2, sequence position 451-1378) including the ORF for the ca. 15kDa protein (Fig. 5) 
and cleavage sites for the restriction endonucleases SmaI and SalI at the 5'- and 3'-ends. 
Amplification of the SmaI-SalI mycobacterial DNA fragment for insertion into pTrxFus 
(Invitrogen, Leek, The Netherlands) was done using the oligonucleotide primers with the 
sequence 5'-TCTAGACATATGACGCGAGTGGGTGTGATTTCG-3' (PRIMER 7, forward) 
and 5'-CATATGGTCGACCTAGGGCGTGTCTCCCAA-3' (PRIMER 8, reverse) 
corresponding to sequence positions 451-474 and 1378-1361 (Fig. 2). Composition of the 
reaction mix was the same as described above with 400 ng phage DNA as template. The probe 
was amplified in 30 cycles consisting of the same conditions as described. Cleavage sites were 
introduced by appropriate primers. After digestion with both restriction endonucleases the 
product was inserted in pTrxFus (Invitrogen, Leek, The Netherlands) to form the plasmid 
pCH3-8. 
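Because the cleavage sites are carried by the primers, the printed primer sequences can be scanned for recognition sequences. The sketch below checks four well-known palindromic sites (SmaI CCCGGG, SalI GTCGAC, NdeI CATATG, XbaI TCTAGA); the choice of this small panel and the function name find_sites are assumptions made for illustration, and the scan says nothing about sites in the template or vector.

```python
# Recognition sequences of four common restriction endonucleases.
SITES = {
    "SmaI": "CCCGGG",
    "SalI": "GTCGAC",
    "NdeI": "CATATG",
    "XbaI": "TCTAGA",
}

PRIMER_7 = "TCTAGACATATGACGCGAGTGGGTGTGATTTCG"  # forward, as printed
PRIMER_8 = "CATATGGTCGACCTAGGGCGTGTCTCCCAA"     # reverse, as printed


def find_sites(oligo: str, sites: dict = SITES) -> dict:
    """Map enzyme name to the 0-based positions of its site in the oligo.

    All sites in SITES are palindromic, so scanning one strand is enough.
    """
    oligo = oligo.upper()
    hits = {}
    for enzyme, site in sites.items():
        positions = [i for i in range(len(oligo) - len(site) + 1)
                     if oligo.startswith(site, i)]
        if positions:
            hits[enzyme] = positions
    return hits


print("PRIMER 7:", find_sites(PRIMER_7))
print("PRIMER 8:", find_sites(PRIMER_8))
```

Run on the primers as printed, the scan reports XbaI and NdeI sites in PRIMER 7 and NdeI and SalI sites in PRIMER 8.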

The E. coli strain GI724 was electroporated with the plasmid pCH3-8. Bacterial cultures (200 
ml of Induction Medium (Invitrogen, Leek, The Netherlands) supplemented with 100 µg/ml 
ampicillin) grown at 30°C were induced to synthesize the fusion protein by tryptophan addition 
(100 µg/ml) and temperature shift to 37°C. Cells were collected after 4 hours (10 000 x g, 5 min, 
4°C), resuspended in 4 ml Osmotic Shock Solution (Invitrogen, Leek, The Netherlands), broken 
by three rounds of alternate sonication on ice (10 sec.) and shock freezing in liquid nitrogen, 
and pelleted (10 000 x g, 15 min, 4°C). Most of the fusion protein accumulated in the form of 
inclusion bodies and only a small fraction was present as soluble protein inside the cells. The 
pellet containing the inclusion bodies was resuspended (denaturation) in 10 ml 6 M 
guanidine/HCl (pH 8.5), incubated for 2 hours at room temperature and pelleted again (10 000 x 
g, 30 min, 4°C). The recombinant fusion protein was refolded by dialysing against 50 mM 
Tris/HCl (pH 8.0). Anion exchange chromatography was done with the help of a BioCAD 
perfusion system (Perseptive Biosystems) on a Poros column HQ/M (Perseptive Biosystems). 
Elution was performed using a linear NaCl gradient (0-1 M). The fusion protein concentration 
was determined with the BioRad Protein Assay Kit (BioRad, Munich, Germany). Purity was 
assessed by densitometry (Molecular Dynamics, Software Image Quant) and analytical SDS- 
PAGE and Coomassie staining. 

The ca. 15kDa protein fused to thioredoxin was refolded as described above. Further 
purification of the ca. 15kDa protein fused to thioredoxin was carried out by anion exchange 
chromatography (Fig. 8, A lane 3 and B lane 1). After enterokinase cleavage of the purified ca. 
15kDa protein fused to thioredoxin two protein bands were detectable on SDS-PAGE (Fig. 8, 
lane 2). By western blotting with a thioredoxin monoclonal antibody the lower 11kDa band was 
identified to be thioredoxin. The upper band corresponds to the ca. 15kDa recombinant protein 
of M. tuberculosis. This is the first report of expression and purification of the ca. 15kDa 
protein of M. tuberculosis in E. coli. 

1.4. Species-specific diagnosis of mycobacteria: 
Deprotected and desalted oligonucleotide primers were obtained from Gibco BRL (Eggenstein, 
Germany) or Eurogentec (Seraing, Belgium). 

The oligodeoxyribonucleotide primers with the sequence 5'-GTCCATGTGCCGCCGCTG-3' 
(PRIMER 3, forward) and 5'-CTGCGCGGCTCCCGGCAO-3' (PRIMER 4, reverse), 
specific for the DNA regions of the 2253-bp M. tuberculosis chromosomal region shown in Fig. 
2 were used in PCR experiments to amplify a 377-bp fragment. 

For amplification of a 380-bp fragment from the 440-bp chromosomal fragment, the 
oligodeoxyribonucleotide primers with the sequences 5'-CGAGGCTGAACGGCTTTG-3' 
(PRIMER 5, forward) and 5'-TCAACGTCCGCGGCAAGC-3' (PRIMER 6, reverse) 
corresponding to the DNA region shown in Fig. 3 were used. Amplifications were performed in 
0.2 ml MicroAmp Reaction Tubes (Perkin Elmer, Norwalk, Connecticut, USA) in a final 
volume of 100 µl using a GeneAmp® PCR Kit (Perkin Elmer, Branchburg, New Jersey, USA). 
Reaction mixtures contained 10 mM Tris/HCl (pH 8.3), 50 mM KCl, 3 mM MgCl2, 200 µM 
dNTP, 0.1 µM primer, 30-100 ng chromosomal DNA from mycobacterial cell extracts (Table 
1) and 2.5 U AmpliTaq® DNA polymerase. All components of a PCR reaction except for the 
template are included in the Kit. The reactions were performed using the automated Thermal 
Cycler GeneAmp PCR System 9600 (Perkin Elmer, Norwalk, Connecticut, USA). The samples 
were amplified by 40 cycles consisting of denaturation at 96°C for 2 min, annealing of the 
primers at 25°C for 1 min and primer extension at 72°C for 3 min. 

After amplification, 10 µl of each product was electrophoresed in a horizontal 1.5% agarose 
gel. Gels were precast using a 1:10 000 dilution of SYBR Green I stock reagent (Eugene, 
Leiden, The Netherlands) in 10 mM Tris/HCl, 1 mM EDTA (pH 8.0). 

For DNA sequencing, the appropriate 377-bp and 380-bp PCR products from the mycobacterial 
cell extract samples (Table 1) were purified from a 1.5% agarose gel using a Gel Extraction 
Kit (QIAGEN, Hilden, Germany). 

1.4.1. The 377-bp region: 

The 377-bp region (Fig.2) of the isolated and sequenced 2253-bp M. tuberculosis chromosomal 
fragment and the 380-bp region (Fig.3) of the identified 440-bp chromosomal fragment were 
examined for their suitability for strain differentiation (Table 1). A PCR-product of the 
predicted size and a 100% DNA sequence homology in the 377-bp region was detected only in 
the members of the M. tuberculosis complex. No amplification product was obtained from other 
mycobacteria (Table 1). Therefore, the PCR primers of the 377-bp region are useful for the 
rapid discrimination of M. tuberculosis complex (M. tuberculosis, Mycobacterium bovis, 
Mycobacterium bovis BCG, Mycobacterium africanum and Mycobacterium microti) from 
other mycobacteria. 

1.4.2. The 380-bp region: 

A predominant amplification product of the correct size for the 380-bp region was obtained from the 
chromosomal DNA samples of the M. tuberculosis complex, including the vaccine strain M. 
bovis BCG, the tuberculosis isolate Tubl 18 and the mycobacterial species M. asiaticum, 
M. gastri, M. gordonae and M. kansasii. Thus, this fragment can be used for the identification of 
the above mycobacterial species, since no amplification product was obtained from other 
mycobacterial species (Table 1). 

2. Clone containing PGRS Element 
2.1. Cloning of a DNA fragment containing PGRS elements: 
We screened a Lawrist cosmid library of M. tuberculosis DNA using a degenerate 
oligonucleotide of the sequence 
5'-C/GGCC/GGCC/GGGC/GACC/GGGC/GGGC/GGCCGGCTCC/GGG-3', which was designed 
in such a way that it contained GC-rich regions and coded for a putative proline-rich 
polypeptide. Colony hybridization using the labelled oligonucleotide was performed using standard 
procedures (Maniatis et al., 1982). Filters were prehybridized and probed at 42°C overnight in a 
solution containing 6xSSC, 1 mM sodium phosphate, 1 mM EDTA, 0.05% skimmed milk and 
0.5% SDS. Filters were washed twice in 2xSSC, 0.3% SDS for 15 min at 65°C. 
The first screening yielded six positive clones, which were rechecked by hybridization with the 
oligonucleotide. Three clones gave a strong signal, and restriction mapping of these clones showed 
identical restriction patterns. Further restriction mapping and Southern hybridization of one of 
the clones identified an about 4kb SalI fragment that hybridized strongly to the 
oligonucleotide. 
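
The degenerate probe can be written in IUPAC notation and expanded into its individual sequences. The sketch below assumes that each "C/G" in the probe marks a single position that is either C or G (IUPAC code S); the function names are illustrative assumptions.

```python
from itertools import product

# Probe sequence as given in the text, with "C/G" marking a mixed position.
PROBE = "C/GGCC/GGCC/GGGC/GACC/GGGC/GGGC/GGCCGGCTCC/GGG"


def to_iupac(notation: str) -> str:
    """Replace every 'C/G' by the IUPAC ambiguity code S."""
    return notation.replace("C/G", "S")


def expand(iupac: str) -> list[str]:
    """List every concrete sequence encoded by a probe containing S positions."""
    choices = [("C", "G") if base == "S" else (base,) for base in iupac]
    return ["".join(combo) for combo in product(*choices)]


iupac_probe = to_iupac(PROBE)
variants = expand(iupac_probe)
print(iupac_probe, len(iupac_probe), len(variants))
```

Under this reading the probe is a 30-mer with eight mixed positions, so the labelled oligonucleotide is a pool of 2^8 = 256 sequences.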

2.2. DNA sequencing of the cloned fragment: The ca. 4kb SalI fragment was 
subcloned in pUC19 and the clone was named Mtub-Clara-Klon. The entire insert was sequenced 
by the primer walking method. The DNA sequence is presented in Fig. 9. There were unusual 
difficulties in obtaining the sequence of the recombinant clone because of its high GC 
content and the presence of unusual repeats. 
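
The GC content referred to here can be quantified directly from the sequence. The following is a minimal sketch with illustrative function names and arbitrarily chosen window parameters; the example input is the first 50 nucleotides of the sequence in Fig. 9.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence; whitespace is ignored."""
    seq = "".join(seq.split()).upper()
    return sum(base in "GC" for base in seq) / len(seq)


def windowed_gc(seq: str, window: int = 20, step: int = 10):
    """GC fraction in sliding windows, as (1-based start, fraction) pairs."""
    seq = "".join(seq.split()).upper()
    return [(i + 1, gc_fraction(seq[i:i + window]))
            for i in range(0, len(seq) - window + 1, step)]


# First 50 nucleotides of the SalI fragment (Fig. 9).
FIRST_50 = "GTCGACGTCT ACCGCACCTT CGTCGGCGAG ATGGACGACG AAGAGGCCGA"

print(gc_fraction(FIRST_50))
print(windowed_gc(FIRST_50))
```

Applied to the whole insert, the windowed values would indicate which stretches are most GC-rich and therefore most likely to cause the sequencing difficulties mentioned above.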

2.3. Proteins coded by the cloned DNA: 

We identified at least 4 ORFs (open reading frames) coding for proteins of ca. 9kDa, 55kDa, 74kDa and 
77kDa (Fig. 10). Interestingly, the amino acid sequences of the 9kDa, 55kDa, 74kDa 
and 77kDa proteins did not show strong homology to any sequences reported so far for 
mycobacteria (GenBank and Swiss-Prot databases). In addition, the 9kDa, 55kDa and 
74kDa proteins have an unusually high proline content; nevertheless, no strong homology with 
the known proline-rich antigens (Laqueyrerie et al., 1995; Infect. Immun. 63, 4003) of 
mycobacteria was observed. Unexpectedly, the amino acid sequences showed restricted 
homology to mucin-like proteins from eucaryotes. The 77kDa protein is highly rich in the amino 
acid glycine and may be a cell wall protein of Mycobacterium tuberculosis. Such proteins have 
not been reported from M. tuberculosis. 
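
The relation between the DNA of Fig. 9 and the encoded proteins can be illustrated by conceptual translation. The sketch below assumes the standard genetic code; the function names are illustrative. As input it uses the last 46 nucleotides of the Fig. 9 sequence, whose reverse complement, read from its second nucleotide, gives the first residues of the glycine-rich 77kDa protein.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string; whitespace is ignored."""
    seq = "".join(seq.split()).upper()
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str, frame: int = 0) -> str:
    """Translate complete codons from the given 0-based frame offset."""
    seq = "".join(seq.split()).upper()[frame:]
    # Codons containing unreadable characters are reported as X.
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


# Last line of the Fig. 9 sequence (46 nucleotides, ending in the SalI site).
LAST_LINE = "ATAGGCCTGG CCGTGGGCGC CGAACAGCGC GGCGATGGCT GTCGAC"

print(translate(reverse_complement(LAST_LINE), frame=1))
```

The output matches the N-terminal residues listed for the 77kDa protein, which indicates that this ORF lies on the strand complementary to the one shown in Fig. 9.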

2.4. DNA finger-printing: 
The ca. 4kb SalI fragment was used to probe (Southern hybridization) genomic DNA of 
different mycobacteria digested with PvuII (Fig. 11). The results show that each strain gave a 
characteristic pattern, allowing the differentiation of M. tuberculosis-Rv, M. tuberculosis-Ra, M. 
bovis and the M. tuberculosis Erdman strain. The ca. 4kb SalI fragment is also suitable for 
finger-printing of clinical isolates, since hybridization of the probe to the genomic DNA of 
clinical isolates from tuberculosis patients also yielded strain-specific finger-prints (Fig. 12). No 
hybridization to the genomic DNA of M. smegmatis, M. vaccae, M. avium, M. chelonae, M. 
fortuitum or M. phlei was observed. 
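
The fragment patterns underlying such finger-prints arise from the positions of restriction sites in the genomic DNA. The following is a minimal sketch of an in silico digest of a linear sequence, assuming the usual recognition sites CAG^CTG for PvuII and G^TCGAC for SalI; the function name and the toy sequence are illustrative assumptions. It does not model the hybridization step, which determines which of the fragments become visible with the probe.

```python
# Recognition site and cut offset within the site (top strand).
SITES = {
    "PvuII": ("CAGCTG", 3),  # CAG^CTG, blunt ends
    "SalI": ("GTCGAC", 1),   # G^TCGAC
}


def digest(seq: str, enzyme: str) -> list[int]:
    """Return fragment lengths after complete digestion of a linear sequence."""
    site, offset = SITES[enzyme]
    seq = "".join(seq.split()).upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Hypothetical sequence with two PvuII sites and no SalI site.
toy = "AAAA" + "CAGCTG" + "TTTTT" + "CAGCTG" + "GG"

print(digest(toy, "PvuII"))
print(digest(toy, "SalI"))
```

Strains that differ in the number or spacing of PvuII sites around the regions recognized by the probe would yield different fragment lengths and hence different banding patterns.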

Claims 



1. (I) DNA 

(a) having sequence (I) according to figure 9, wherein 
optionally one or more codons can be replaced by codons coding 
for the same amino acid(s), 

(b) having a sequence complementary to that of (a), 

(c) being single-stranded, wherein its strand is hybridizable 
with that of the DNA according to (a) or (b), 

(d) being double-stranded, the sequences of its single strands 
being defined as in (a) and (b), respectively, 

(e) being double-stranded, its single strands being hybridizable 
with those of the DNA according to (d), or 

(f) being a subsequence of the sequences according to (a) to 
(e); or 

(II) DNA 

(a) having sequence (VI) according to figure 2, wherein 
optionally one or more codons can be replaced by codons coding 
for the same amino acid(s), 

(b) having a sequence complementary to that of (a), 

(c) being single-stranded, wherein its strand is hybridizable 
with that of the DNA according to (a) or (b), 

(d) being double-stranded, the sequences of its single strands 
being defined as in (a) and (b), respectively, 

(e) being double-stranded, its single strands being hybridizable 
with those of the DNA according to (d), or 

(f) being a subsequence of the sequences according to (a) to 
(e); or 

(III) DNA 

(a) having sequence (IX) according to figure 3, wherein 
optionally one or more codons can be replaced by codons coding 
for the same amino acid(s), 

(b) having a sequence complementary to that of (a), 

(c) being single-stranded, wherein its strand is hybridizable 
with that of the DNA according to (a) or (b), 

(d) being double-stranded, the sequences of its single strands 
being defined as in (a) and (b), respectively, 

(e) being double-stranded, its single strands being hybridizable 
with those of the DNA according to (d), or 

(f) being a subsequence of the sequences according to (a) to 
(e). 

2. A DNA according to claim 1 (I)(c), (I)(e), (II)(c), (II)(e), 
(III)(c) or (III)(e), its single strands being hybridizable at a 
temperature of at least 25 °C and at a concentration of NaCl of 
1 M. 

3. RNA being a transcript of a DNA according to claim 1 or 2. 

4. Protein being encoded by a DNA according to claim 1 or 2. 

5. Protein having the amino acid sequence (II) according to 
figure 13. 

6. An about 74 kDa protein according to claim 4 or 5. 

7. Protein having the amino acid sequence (III) according to 
figure 14. 

8. An about 77 kDa protein according to claim 4 or 7. 

9. Protein having the amino acid sequence (IV) according to 
figure 15. 

10. An about 9 kDa protein according to claim 4 or 9. 

11. Protein having the amino acid sequence (V) according to 
figure 16. 

12. An about 55 kDa protein according to claim 4 or 11. 

13. Protein having the amino acid sequence (VII) according to 
figure 5. 

14. An about 15 kDa protein according to claim 4 or 13. 

15. Protein having the amino acid sequence (VIII) according to 
figure 6. 

16. An about 31 kDa protein according to claim 4 or 15. 

17. Protein having the amino acid sequence (X) according to 
figure 7. 

18. An about 17 kDa protein according to claim 4 or 17. 

19. A protein according to any of claims 4 to 18, wherein the 
protein is a recombinant protein, especially a protein produced 
by means of a bacterial strain, a yeast strain, a fungal strain 
or a cell line of a higher eucaryote. 

20. A protein being encoded by a DNA sequence according to claim 
1 or 2 and which can be recovered by a method comprising the 
following steps: 

(i) subjecting proteins encoded by said DNA sequence to a 
usual test for diagnosis of tuberculosis; 

(ii) selecting a protein showing an inhibitory effect and 

(iii) isolating and recovering said protein. 



21. DNA according to claim 1 or 2, RNA according to claim 3 or 
protein according to any of claims 4 to 20 which can be used for 

(i) diagnosis of tuberculosis in humans and animals and/or 

(ii) diagnosis of other mycobacterial infections in humans or 
animals, 

each especially by means of samples taken from humans or 
animals. 

22. Use of a DNA according to any of claims 1, 2 or 21 for the 
identification of mycobacteria in media samples. 

23. Use according to claim 22, comprising the steps of 

(i) isolating the mycobacterium, 

(ii) preparing crude or purified genomic DNA, 

(iii) hybridizing it to a DNA according to claim 1 or 2 and 

(iv) detecting the fragment pattern using conventional methods 
such as a radioactivity assay, chemiluminescence or 
fluorescence. 

24. Use according to any of claims 22 to 23, wherein as samples 
clinical samples are used. 

25. Use of a DNA according to any of claims 1 to 2 or of a 
protein according to any of claims 4 to 20 for 

(i) epidemiological purposes and/or 

(ii) vaccination follow-up 

for humans or animals suffering from mycobacterial infections, 
especially tuberculosis. 

26. Use of a DNA according to any of claims 1 to 2 or of a 
protein according to any of claims 4 to 20 for the development 
of drugs useful for combating mycobacterial infections of humans 
or animals, especially tuberculosis, especially for testing and 
recovering of substances inhibiting mycobacterial infections in 
humans and animals, especially tuberculosis. 

Fig. 9. DNA sequence of the ca. 4kb SalI fragment (sequence (I))

   1 GTCGACGTCT ACCGCACCTT CGTCGGCGAG ATGGACGACG AAGAGGCCGA
  51 CCATCATTAC CGCGCGGGCA TGGCGATGGG CACCACGTTG CAGGTGCCGC
 101 CGCAGATGTG GCCACCGGAT CGGGCGGCCT TCGACCGCTA CTGGCGGCAA
 151 TCACTGGACA GGGTGCACAT CGATGACGTC GTTCGCGACT ACCTGTATCC
 201 GATCGTGGCG CTCCGAATTC GCGGGATCGC ACTGCCGGGT CCGCTGCGGC
 251 GGCTGTCGGA GGGTATCGCG CTGCTGATCA CCACCGGTTT CCTGCCGCAG
 301 CGGTTTCGCG ACGAGATGCG GTTGCCGTGG GACGCGACCA AGCAGCGGCG
 351 CTTTGACGCG CTCATGGCCG TGCTGCGCAC GGTGAATCGC CTGATGCCGC
 401 GGTTTGTCCG GGAGTTCCCC TTCAACCTGA TGCTCTGGGA CCTGGACCGG
 451 CGGATGAGGC GCGGGCGCCC GCTGGTGTAA TCGACGGCTT CGCGTGGACC
 501 GATGGCGGTA GACCGCTCGC TAGATTGGCG GGCGAATTTG GTGCACAGAG
 551 GCAAACCGGG CGAAATCCCT ATCCAGGCTC ACCACGGCGC AGTGATGCTC
 601 CACGGCGATG GCCCCGAGTA CCGCGTCAGG TATCAAGTCG CCCGATGCci
 651 CGGCCTCGTC GCAGAGTTTT CGCAGCAGCA CCAGGTGTCT GGGGCCGGGG
 701 CTTGTCGGAA GGTCATGGGG CTGGGCGTTG ACGGCTTCGA CGAATGCGAA
 751 TGCATCCGCT CGTGGTGACG GAATCTCGAA GATGCGTCGA TTCGTTGTTA
 801 GCCGGAGGAA CGACGCCCAC ACTAGGTTCG GCACTGTGAA GGGGTCGTCG
 851 GCCGCAAGCA GTCGATCGAA CCAGGGGCGG ACGGTTCGGT GATTCGGATC
 901 GTCACCGCGG tgtgcagcca gcagcacgtt gacgtcgatc aggaacatcg
 951 CCTATTTGTG CCTGTCCAGG CTCACTTCCG CGAGTTCAGT TCCAGACCCT
1001 CGTCGAGCAC TTCGGACAAC ACCGTATTCG AGGTTAGGTC GATACCTGGC
1051 CGCGGACCGG TGCCGGCGTC AAAAACGGGG ACGGTTGGCC GGGCGCCGCC
1101 GGTACGGGCG GCGGCGAGCT CCCGCCGAAG GGCGTCTTCG ATCACAGCGC
1151 CCAGCGATTA ACCACGCTCG CGGGCCCGGC GTTTGGCGGT AGCCAGTAGT
1201 TCATCCGAGA TTGACACGGT GGTGCGCATG ATGCTCAGGA TAGCGCATCT
1251 ACGGCATCAT CTGCGGTGAG CAACTGATGC CCTCAACGCC GCGTGTGGTC
1301 GCAGGTCTGC CTGCTATGGC AAGCCGTTGA GTCCGTTCTC GCCGAGCAGC
1351 AGCCCGCCGG TGCCGCCGGC ACCGGGCGTG GCCCCGGCOT TGCCGGCGTT
1401 GGCGCCGTTG CCGCCGTTGC CGATCAGCAC GGCGTTGCCG CCGACACCAC
1451 CGCTGCCGCC GGTACCGGCG CCAAACCCGC CGGCAACCCC CGTCACCGCC
1501 GTTGCCGAAC ACCCCGGCGT GGCCACCGTC ACCGCCGGTG CCGCCGGTAC
1551 CGGCGCCTAG AGCGTTGGCA CCCCTGCCGC GGGCGCCGCC GGCGCCGGCG
1601 GAGCCGAAGA GCAAGCCGCC GTTCCCGCCG GCGCCGCCGG CGCCGCCTTG
1651 CTGGATGCTG GTAAGTGCTG CCCCGCCGTG CCCGCCGCCG CCGCCGGCGC
1701 CGCCGAAGCC GAAGAGTAAG GCGCCGTTCC CGCCGGTTCC GCCGGCCCCG
1751 CCGGCAAGGG AGCTGGCGCC ACCGCTGCCG CCGGCGCCAC CGGAGGCGCC
1801 GAGGGAGAGT AGGCCGGCGT TGCCGCCGTG CCCGCCGCCG CCGGTGGTGA
1851 TCCCGGACCC TCCCGAGCCG GCGGCGCCGC CGGTGCCGCC GGCTCCGAAC
1901 AGTCCGCCGT TCCCGCCGTT CCCACCGGCC CCGAAGTTCG TGCCGGCCCC
1951 GCCGGTGCCG CCAGTTCCGA ACAGTCCGCC GTTCCCGCCG TTCCCGCCGG
2001 CTGCGTTGAA CCCGCCGGCC CCTCCGGCTC CGCCGTTGGC GAACAGTCCG
2051 CCGTTGCCGC CGGCGCCGCC GACGCCGGCC GGGACACCGC CAGCGGCGCC
2101 GTGGCCGCCG GTGCCGGCCG CGCCGAAGAG CAAACCGGCG TCGCCGCCGC
2151 GCCCGCCGGC CCCGCCGATG CCAGCGACGC CTATGGAGTT CCCACCGTTG
2201 CCGCCGGTGC CGCCGGATCC GATCAGCAAG GAGACCCCAC CGGCGCCGCC
2251 GGCCCCGCCG ATCCCTCCAG CACCGGTGCC TATCCCGCCG GTCCCGCCAT
2301 TGCCACCGGT ACCGAACAAG ATCCCGCCGG CCCCGCCGGC CCCGCCCGTA
2351 GCCGTGGCGG CGGTGTTGGT CGCACCGTGC CCGCCGTTAC CGCCGTTGCC
2401 GAACAACCAC CCGCCGGCCC CGCCGGCAGC CCCGGTCCCC GGGGTCCCCT
2451 TGGCGCCGTT GCCGAACAGC CACCCGCCGG CCCCGCCGTC AGCCCCGGTT
2501 CCAGGAGTCC CGTTGGCGCC GTTGCCGATC AGCGGGCGGC CGGTGAGCGT
2551 CTGGAAGGGC TCGTTCACCA CATTGAGCAC ATTTTGCTGC AGGGTGTGCA
2601 GTGGCGAGGT GCTCGCGGGA GCATTGAATC CGTCTAGACC GAGCAGGAGC
2651 CCGCTGACGA CGACCACTCC GGCCTTGCCC GCGCCAATCC CACCGCTACC
2701 GCCGTTACCG CCATTGCCGA TCAACACGGC GGTGCCACCG ATCCCGCCci
2751 TGCCGCCGGT CACCGCGCTG GCGCCACCGT TACCGCCGTT GGCGCCGTTA
2801 CCGATCAGCC CGGGGGTGCC GCCAGCCCCA CCGATCCCGC CGGGGAAGCC
2851 CTGGACAACT CCGCCGTTGG CGCCGGCGCC GCCGGAGCCG AAGACCGTGC
2901 CGGTGTTGCC CCCGGGGCCG TCTTGCCCGC CGTCGGAGAA GCCGAATCCG
2951 CCGGCGCCGC CGGAGCCGCC GGAGCCGAAG AGCAGCCCAG CGTTGCCGCC
3001 GGCGCCGCCG GCGCCGtCTA TGCCGTCGGC CGTGAGAGTA CCGCCGTCCC
3051 CACCGATTCC GCCGGCGCCG CCCGCGGCGC CGAGGGCGAG CATGCCGGCA
3101 TTGCCGCCGG CCCCGCCGTC CCCGCCGGCG ACCAGGCTGT GTCCGCCGCT
3151 GCCGCCTTCC CCGCCTGCGC CGAACAGCCC GCCGGCCCCG CCGGCCCCGC
3201 CGACTCCGCC GAAGCTGCTG TCGGCGAACC CGCCATGCCC GCCGGTGCCG
3251 CCGGCGCCGA ACAGACCGCC AGCGCCACCG GCCCCACCGG CCCCGCCGGA
3301 GCTGCCGGCC CCACCGGATC CGCCGACCCC GCCGGTGGCG AACAGCCCGC
3351 CGGCCCCGCC GGCGCCGCCC GCCCCGCCGA GTGCACTGCC GTTCGTGAAT
3401 CCGCCGGCCC CGCCGACTCC GGCGGCGCCG AAGAGCAGGC CGGCGTTGCC
3451 GGCAGCCCCG CCGGCGCCGC CGGCCCCGCC CGTGAGGGCT ACTACGCCGC
3501 CGCCGGCGCC GCCGGCGCCG CCGGCGCCGA ACAGCATGGC GTTGCCGCCG
3551 GCTCCGCCGG ACCCGCCGAT CCCACTGCTG GCGACCCCGC CAGCGCCGCC
3601 GGCGCCGCCG TTGCCGATGA GCCCGCCGGC GCCGCCGTTG CCGCCGGCCG
3651 CGCCGGATCC TCCGGCGCCG CCGTTGACGA TTAACCAGCC GCCGTCCCCG
3701 CCATTGGCCC CGGTGCCGGG GGCGCCGTTG GCGCCGTTGC CGATCAACGG
3751 GCGCCCGGTA TTCGCCAGGA AGAACTCGTT GATCGGATCC AGCAGCGGCG
3801 ACACCGCGGC GGCCTCGGCG GCCGCATAGG CGCCGCCACC GGAGGTCAAT
3851 GCCTGCACGA ACTGGGCATG AAACGCCTGC GCTTGGGCGC TGAGCGCCTG
3901 ATAGGCCTGG CCGTGGGCGC CGAACAGCGC GGCGATGGCT GTCGAC

Fig. 11A. Southern hybridization with genomic DNA from different 
mycobacteria digested with PvuII (1. M. tuberculosis H37Rv, 2. M. avium, 3. 
M. kansasii, 4. M. necroti, 5. M. fortuitum, 6. M. phlei, 7. M. smegmatis, 8. M. 
vaccae). 



Fig. 11B 

Finger-print obtained using the DNA (BamHI digest) of 1. M. tuberculosis H37Rv, 
2. M. tuberculosis H37Ra, 3. M. bovis BCG, and 4. M. tuberculosis 
H37Rv digested with Sal I. 

Fig. 12. Finger-print with DNA from different M. tuberculosis clinical isolates 
(numbered 1-12) digested with the PvuII restriction enzyme. The 4 kb Sal I 
fragment (Mtub-Klar-Klon) was used as probe. 

Fig. 13

Amino acid sequence of the protein of about 74kDa. 
Molecular-weight 74999 ; Length 764 
THE AMINO ACID SEQUENCE IS GIVEN BELOW: 

        10         20         30         40         50         60
VPPVPAPRAL APLPPAPPAP AEPKSKPPFP PAPPAPPCWM LVSAAPPCPP APPAPPKPKS

        70         80         90        100        110        120
KAPFPPVPPA PPARELAPPL PPAPPEAPRE SRPALPPCPP PPVVIPDPPE PAAPPVPPAP

       130        140        150        160        170        180
NSPPFPPFPP APKFVPAPPV PPVPNSPPFP PFPPAALNPP APPAPPLANS PPLPPAPPTP

       190        200        210        220        230        240
AGTPPAAPWP PVPAAPKSKP ASPPRPPAPP MPATPMEFPP LPPVPPDPIS KETPPAPPAP

       250        260        270        280        290        300
PIPPAPVPIP PVPPLPPVPN KIPPAPPAPP VAVAAVLVAP CPPLPPLPNN HPPAPPAAPV

       310        320        330        340        350        360
PGVPLAPLPN SHPPAPPSAP VPGVPLAPLP ISGRPVSVWK GSFTTLSTFC CRVCSGEVLA

       370        380        390        400        410        420
GALNPSRPSR SPLTTTTPAL PAPIPPLPPL PPLPINTAVP PIPPLPPVTA LAPPLPPLAP

       430        440        450        460        470        480
LPISPGVPPA PPIPPGKPWT TPPLAPAPPE PKTVPVLPPG PSCPPSEKPN PPAPPEPPEP

       490        500        510        520        530        540
KSSPALPPAP PAPSMPSAVR VPPSPPIPPA PPAAPRASMP ALPPAPPSPP ATRLCPPLPP

       550        560        570        580        590        600
SPPAPNSPPA PPAPPTPPKL LSANPPCPPV PPAPNRPPAP PAPPAPPELP APPDPPTPPV

       610        620        630        640        650        660
ANSPPAPPAP PAPPSALPFV NPPAPPTPAA PKSRPALPAA PPAPPAPPVR ATTPPPAPPA

       670        680        690        700        710        720
PPAPNSMALP PAPPDPPIPL LATPPAPPAP PLPMSPPAPP LPPAAPDPPA PPLTINQPPS

       730        740        750        760
PPLAPVPGAP LAPLPINGRP VFARKNSLIG SSSGDTAAAS AAA*

Fig. 14

A glycine rich protein of about 77kDa. 

Molecular-weight 77056 ; Length 899 

THE AMINO ACID SEQUENCE IS GIVEN BELOW: 

        10         20         30         40         50         60
STAIAALFGA HGQAYQALSA QAQAFHAQFV QALTSGGGAY AAAEAAAVSP LLDPINEFFL

        70         80         90        100        110        120
ANTGRPLIGN GANGAPGTGA NGGDGGWLIV NGGAGGSGAA GGNGGAGGLI GNGGAGGAGG

       130        140        150        160        170        180
VASSGIGGSG GAGGNAMLFG AGGAGGAGGG VVALTGGAGG AGGAAGNAGL LFGAAGVGGA

       190        200        210        220        230        240
GGFTNGSALG GAGGAGGAGG LFATGGVGGS GGAGSSGGAG GAGGAGGLFG AGGTGGHGGF

       250        260        270        280        290        300
ADSSFGGVGG AGGAGGLFGA GGEGGSGGHS LVAGGDGGAG GNAGMLALGA AGGAGGIGGD

       310        320        330        340        350        360
GGTLTADGID GAGGAGGNAG LLFGSGGSGG AGGFGFSDGG QDGPGGNTGT VFGSGGAGAN

       370        380        390        400        410        420
GGVVQGFPGG IGGAGGTPGL IGNGANGGNG GASAVTGGNG GIGGTAVLIG NGGNGGSGGI

       430        440        450        460        470        480
GAGKAGVVVV SGLLLGLDGF NAPASTSPLH TLQQNVLNVV NEPFQTLTGR PLIGNGANGT

       490        500        510        520        530        540
PGTGADGGAG GWLFGNGANG TPGTGAAGGA GGWLFGNGGN GGHGATNTAA TATGGAGGAG

       550        560        570        580        590        600
GILFGTGGNG GTGGIGTGAG GIGGAGGAGG VSLLIGSGGT GGNGGNSIGV AGIGGAGGRG

       610        620        630        640        650        660
GDAGLLFGAA GTGGHGAAGG VPAGVGGAGG NGGLFANGGA GGAGGFNAAG GNGGNGGLFG

       670        680        690        700        710        720
TGGTGGAGTN FGAGGNGGNG GLFGAGGTGG AAGSGGSGIT TGGGGHGGNA GLLSLGASGG

       730        740        750        760        770        780
AGGSGGASSL AGGAGGTGGN GALLFGFGGA GGAGGHGGAA LTSIQQGGAG GAGGNGGLLF

       790        800        810        820        830        840
GSAGAGGAGG SGANALGAGT GGTGGDGGHA GVFGNGGDGG CRRVWRRYRR QRWCRRQRRA

       850        860        870        880        890        900
DRQRRQRRQR RQSRGHARCR RHRRAAARRE RTQRLAIAGR PATTRGVEGI SCSPQMMP*

Fig. 15

Amino acid sequence of the about 9 kDa proline 
rich protein 

Molecular-weight 9356 ; Length 98 
AMINO ACID SEQUENCE IS GIVEN BELOW: 

        10         20         30         40         50         60
MALPPAPPDP PIPLLATPPA PPAPPLPMSP PAPPLPPAAP DPPAPPLTIN QPPSPPLAPV

        70         80         90
PGAPLAPLPI NGRPVFARKN SLIGSSSGDT AAASAAA*



WO 97/41252 

Fig. 16 

Proline rich protein of about 55kDa 
Molecular-weight 55982 ; Length 573 
AMINO ACID SEQUENCE IS GIVEN BELOW. 

        10         20         30         40         50         60
VPAAPKSKPA SPPRPPAPPM PATPMEFPPL PPVPPDPISK ETPPAPPAPP IPPAPVPIPP

        70         80         90        100        110        120
VPPLPPVPNK IPPAPPAPPV AVAAVLVAPC PPLPPLPNNH PPAPPAAPVP GVPLAPLPNS

       130        140        150        160        170        180
HPPAPPSAPV PGVPLAPLPI SGRPVSVWKG SFTTLSTFCC RVCSGEVLAG ALNPSRPSRS

       190        200        210        220        230        240
PLTTTTPALP APIPPLPPLP PLPINTAVPP IPPLPPVTAL APPLPPLAPL PISPGVPPAP

       250        260        270        280        290        300
PIPPGKPWTT PPLAPAPPEP KTVPVLPPGP SCPPSEKPNP PAPPEPPEPK SSPALPPAPP

       310        320        330        340        350        360
APSMPSAVRV PPSPPIPPAP PAAPRASMPA LPPAPPSPPA TRLCPPLPPS PPAPNSPPAP

       370        380        390        400        410        420
PAPPTPPKLL SANPPCPPVP PAPNRPPAPP APPAPPELPA PPDPPTPPVA NSPPAPPAPP

       430        440        450        460        470        480
APPSALPFVN PPAPPTPAAP KSRPALPAAP PAPPAPPVRA TTPPPAPPAP PAPNSMALPP

       490        500        510        520        530        540
APPDPPIPLL ATPPAPPAPP LPMSPPAPPL PPAAPDPPAP PLTINQPPSP PLAPVPGAPL

       550        560        570
APLPINGRPV FARKNSLIGS SSGDTAAASA AA*